Summary

Program evaluations can play an important role in public policy debates and in 
oversight of government programs, potentially affecting decisions about program 
design, operation, and funding. One technique that has received significant recent 
attention is the randomized controlled trial (RCT). There are also many other types 
of evaluation, including observational and qualitative designs. 

An RCT attempts to estimate a program’s impact upon an outcome of interest
(e.g., crime rate). An RCT randomly assigns subjects to treatment and control 
groups, administers an intervention to the treatment group, and afterward measures 
the average difference between the groups. The quality of an RCT is typically 
assessed by its internal, external, and construct validity. At the federal level, RCTs 
have been a subject of interest and some controversy in education policy and the 
George W. Bush Administration’s effort to integrate budgeting and performance 
using the Program Assessment Rating Tool (PART). In addition, in the 109th
Congress, pending legislation provides for RCTs (e.g., Sections 3 and 15 of S. 1934; 
Section 114 of S. 667 (Senate committee-reported bill); and Section 5 of S. 1129). 
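
The basic logic described above (random assignment, then a comparison of group averages) can be illustrated with a minimal simulation. This is only a sketch under stated assumptions: the function name simulate_rct, the simulated outcome distribution, and the parameter true_effect are hypothetical and chosen for illustration; the data are generated, not drawn from any actual program or study.

```python
import random
import statistics


def simulate_rct(n_subjects=1000, true_effect=2.0, seed=0):
    """Estimate an intervention's average impact from a simulated RCT.

    All values here are hypothetical: outcomes are drawn from a normal
    distribution, and the intervention is assumed to shift each treated
    subject's outcome by `true_effect`.
    """
    rng = random.Random(seed)

    # Randomly assign half of the subjects to the treatment group.
    subjects = list(range(n_subjects))
    rng.shuffle(subjects)
    treatment_ids = set(subjects[: n_subjects // 2])

    treated, control = [], []
    for subject in range(n_subjects):
        # Outcome the subject would have had without the intervention.
        baseline = rng.gauss(50.0, 10.0)
        if subject in treatment_ids:
            treated.append(baseline + true_effect)
        else:
            control.append(baseline)

    # The impact estimate is the average difference between the groups.
    return statistics.mean(treated) - statistics.mean(control)


if __name__ == "__main__":
    print(simulate_rct(n_subjects=20000, true_effect=2.0, seed=1))
```

Because assignment is random, the two groups are similar on average before the intervention, so the difference in group means approximates the assumed effect. With fewer subjects the estimate becomes noisier, which is one reason a single small study can be an unreliable guide.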

Views about the practical capabilities and limitations of RCTs, compared to 
other evaluation designs, have sometimes been contentious. There is wide consensus 
that, under certain conditions, well-designed and implemented RCTs provide the 
most valid estimate of an intervention’s impact, and can therefore provide useful 
information on whether, and the extent to which, an intervention causes favorable 
impacts for a large group of subjects, on average. However, RCTs are also seen as 
difficult to design and implement well. There also appears to be less consensus about 
what proportion of evaluations that are intended to estimate impacts should be RCTs 
and about the conditions under which RCTs are appropriate. Many observers argue 
that other types of evaluations are necessary complements to RCTs, or sometimes 
necessary substitutes for them, and can be used to establish causation, help bolster 
or undermine an RCT’s findings, or in some situations validly estimate impacts. 
There is increasing consensus that a single study of any type is rarely sufficient to 
reliably support decision making. Many researchers have therefore embraced 
systematic reviews, which synthesize many similar or disparate studies. 

A number of issues regarding RCTs might arise when Congress considers 
making program evaluation policy or when actors in the policy process present 
program evaluations to influence Congress. Should Congress focus on RCTs in these
situations, it might consider, among other things, an RCT’s parameters,
capabilities, and limitations. In addition, Congress might examine the types of
program evaluations that are necessary, question an evaluation’s definitions or 
assumptions, consider how to appropriately use evaluation information in its learning 
and decision making, evaluate how much confidence to have in a study, and 
investigate whether agencies have capacity to properly conduct, interpret, and 
objectively present evaluations. This report will be updated in the 110th Congress.
Congress and Program Evaluation: An
Overview of Randomized Controlled Trials 
(RCTs) and Related Issues 

Introduction 

Program evaluations can play an important role in public policy debates and in 
oversight of government programs, potentially affecting decisions about program 
design, operation, and funding. Many different techniques of program evaluation can 
be used and presented with an intention to inform and influence policy makers. One 
technique that has received significant recent attention in the federal government is 
the randomized controlled trial (RCT). This report discusses what RCTs are and 
identifies a number of issues regarding RCTs that might arise when Congress 
considers making program evaluation policy. For example, in the 109th Congress,
Section 3 of S. 1934 (as introduced) would establish a priority for RCTs when 
evaluating offender reentry demonstration projects; Section 114 of S. 667 (Senate 
Finance Committee-reported bill) would require RCTs for demonstration projects for
low-income families; and Section 5 of S. 1129 (as introduced) would call for RCTs
for projects and policies of multilateral development banks. Issues regarding RCTs 
could also arise when actors in the policy process present specific program 
evaluations to Congress (e.g., in the President’s budget proposals) to influence 
Congress’s views and decision making. For many reasons, evaluations often merit 
scrutiny and care in interpretation. 

Before discussing RCTs in detail, the report places them in context by 
discussing (1) questions that program evaluations are typically intended to address, 
(2) how RCTs relate to other program evaluation methods, and (3) two major roles 
that Congress often takes with regard to program evaluation. The report next 
describes the basic attributes of an RCT, major ways to judge an RCT’s quality, and 
diverse views about the practical capabilities and limitations of RCTs as a form of 
program evaluation. In light of concerns about the reliability of individual studies 
to support decision making, the report also discusses how RCTs can fit into 
systematic reviews of many evaluations. The report next highlights two areas where 
RCTs have garnered recent attention — in education policy and the President’s 
annual budget proposal to Congress. Finally, the report identifies potential issues for 
Congress that could apply to the highlighted cases, oversight of other policy areas, 
and pending legislation. Because the vocabulary of program evaluation can be 
confusing, an appendix provides a glossary with definitions of selected terms. 

Congress, Program Evaluation, and Policy Making 

Key Questions about Government Programs and Policies. Citizens, 
elected officials, civil servants, interest groups, and many other participants in 
governance of the United States have an interest in the performance and results of 
government programs and policies. To that end, stakeholders might want answers
to many questions about programs and policies. For example, how should public 
policy problem(s) be defined? Is a program addressing some or all of the 
problem(s)? How well are federal programs and policies managed? What are they 
achieving? How can they improve? How are stakeholders affected? What 
unintended consequences might result? In the future, what activities and policies 
should the federal government pursue in order to best serve the public? What 
resources should be devoted to a program or policy? 

In addition, stakeholders might want answers to questions about the quality of 
evaluations that are brought to policy discussions, given that participants in the policy 
process will not always advertise weaknesses in studies that also happen to support 
their policy positions. What might those weaknesses be? Stakeholders might also 
ask how well federal agencies evaluate the programs they lead and administer. For 
example, what methods are appropriate to assess a given type of program or policy? 
Given the available quantity and quality of research, what degree of confidence 
should be placed in findings, to date? Do agencies have sufficient capacity to 
evaluate their programs? Are they performing the necessary types of evaluation? Do 
agencies have sufficient independence to credibly evaluate their programs and 
policies? What role should agencies play in evaluating programs? 

At times, many or all of these questions might be of interest to Congress and 
program stakeholders. All of them will typically be of interest to agency program 
managers and leaders. Therefore, any of these questions might be potential subjects 
of congressional oversight or law making. 

Program Evaluation and Informed Policy Making. In response to 
questions like those posed above, program evaluations can be introduced into policy 
discussions by actors in the policy-making process. These actors — who include 
organizations and individuals both inside and outside of government — might be 
interest groups, think tanks, academics, legislators, state or local governments, the 
President, federal agencies, or nonpartisan institutions. Many actors bring evaluations 
to policy discussions on their own initiative, often to emphasize the results or
findings that they interpret as supporting their positions. Some actors (e.g., federal
agencies) might bring evaluations in response to legislative or executive branch 
requirements. Depending on many circumstances, the evaluations that agencies bring 
might, or might not, support the policy views of the agency’s head or the President. 

When actors bring program evaluations into policy discussions, the studies will 
often use different approaches, because there are many possible ways to help
answer the questions cited previously. The term program evaluation, therefore, has
in practice been interpreted in several ways. For example, there is no consensus 
definition for the term program. In practice, the term has been used to refer to a 
government policy, activity, project, initiative, law, tax provision, function, or set 
thereof. Accordingly, this report uses the term program to refer to any of these 
things, as appropriate, that someone might wish to evaluate. 1 The term evaluation 



1 In program evaluation, the terms intervention and treatment are sometimes used as 